Introduce harmonia-file-* libraries#1031
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughChuck Norris doesn't refactor monoliths—he round-house kicks them into perfectly shaped crates. This PR introduces Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
bytecodealliance/cap-std#414 this yak shave happened |
dad31f6 to
11f2fd4
Compare
f44946c to
30728e7
Compare
bdee131 to
6901197
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1031 +/- ##
==========================================
- Coverage 66.00% 65.62% -0.38%
==========================================
Files 150 158 +8
Lines 17332 17453 +121
Branches 17332 17453 +121
==========================================
+ Hits 11440 11454 +14
- Misses 5216 5276 +60
- Partials 676 723 +47
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
21842b4 to
6f499a9
Compare
There was a problem hiding this comment.
Actionable comments posted: 11
🧹 Nitpick comments (1)
harmonia-file-nar/src/archive/mmap.rs (1)
82-84: ⚡ Quick winChuck Norris doesn't need
is_empty()— he knows when a file is empty just by looking at it. But Clippy doesn't have that superpower.Adding
len()without a correspondingis_empty()will triggerclippy::len_without_is_empty.🦵 Roundhouse kick this lint away
pub fn len(&self) -> usize { self.len } + + pub fn is_empty(&self) -> bool { + self.len == 0 + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harmonia-file-nar/src/archive/mmap.rs` around lines 82 - 84, The `len()` method implemented in the struct needs a corresponding `is_empty()` method to satisfy the `clippy::len_without_is_empty` lint. Add an `is_empty()` method immediately after the `len()` method that returns a boolean indicating whether the length is zero. The `is_empty()` method should follow the standard Rust convention of returning true when self.len equals zero and false otherwise.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/architecture/harmonia-store-structure.md`:
- Around line 56-57: The documentation incorrectly states that harmonia-file-fd
uses cap-tokio as the underlying capability provider, but the actual
implementation uses cap-std with an async wrapper, as confirmed by the crate's
Cargo.toml. Update the description at line 56 to replace "via cap-tokio" with
"via cap-std" to accurately reflect the real implementation. Also apply the same
correction at line 167 where the same inaccuracy appears.
In `@harmonia-cache/src/narlist.rs`:
- Around line 21-39: The get_nar_list function unconditionally treats the input
path as a directory by calling Dir::open_ambient_dir, which fails for file or
symlink paths that were previously supported. Check whether the path is a
directory before attempting to open it with Dir::open_ambient_dir; if it is not
a directory (i.e., it is a file or symlink), handle it appropriately by either
creating a DirSource from the file/symlink directly or passing the path
differently to list_deep. This preserves support for both directory and
non-directory roots as the previous implementation did.
In `@harmonia-file-fd/README.md`:
- Around line 3-15: The README documentation contains outdated references to
incorrect crate names and trait locations. Update line 3 to replace the
reference to `cap-tokio` with the correct async backend (the crate's async
`cap-std` wrapper). Update line 8 to replace `harmonia-file-core` with the
correct trait package name `harmonia-file-io-pure`. Update line 14 to replace
any remaining references to `cap-tokio` with the correct async backend. Ensure
all crate and trait references throughout the README accurately reflect the
actual implementation and dependencies used in this crate.
In `@harmonia-file-fd/src/sink.rs`:
- Around line 29-39: The create_regular_file method has a race condition where
the file is created first on line 34 and then permissions are set afterwards on
line 38-39, leaving a window where the file has default umask-derived
permissions. Instead of calling set_permissions after opening, set the desired
Unix mode atomically during the file creation by adding the mode to the
OpenOptions object before calling open_with. Remove the post-creation
set_permissions call and the associated Unix-specific conditional block that
follows it, replacing it with a method call on the opts object (such as mode()
or similar) that the cap::OpenOptions API provides to set permissions at
creation time.
In `@harmonia-file-fd/src/source.rs`:
- Around line 108-114: The file type determination logic is incorrectly treating
all non-directory and non-symlink files as Regular, which includes FIFOs,
sockets, and device nodes that cannot be safely opened as regular files. In the
file_type determination block around lines 108-114, replace the else clause that
defaults to FileType::Regular with an explicit check using meta.is_file() for
regular files, and add a new error case to return an unsupported file kind error
for any other type. This same issue propagates to the file opening code around
lines 127-133 where unsupported file kinds are then attempted to be opened,
which will be fixed by the fail-fast error from the first change preventing
those types from being labeled as Regular in the first place.
- Around line 134-136: The current code automatically uses mmap for all large
files (size > mmap::MMAP_THRESHOLD) via FileReader::Mmap creation, which is
unsafe for non-immutable sources. Refactor the file reading logic to make mmap
opt-in: add a configuration flag or mode parameter to DirSource/FileReader that
explicitly indicates whether the backing files are immutable, and only use
mmap::MappedFile::from_fd when this immutable mode is explicitly enabled. When
immutable mode is disabled (the default), use streamed reads instead of mmap
regardless of file size.
In `@harmonia-file-io-pure/src/serde_impl.rs`:
- Around line 51-56: The match statement on `h.file_type.as_deref()` currently
uses a catch-all pattern `_` that silently defaults all unknown values to
FileType::Regular, making it impossible to distinguish between a legitimately
missing "type" field (which should default to Regular) and an invalid "type"
value (which should fail deserialization). Change the pattern matching to
explicitly handle the None case separately—allowing it to default to
FileType::Regular—while returning a deserialization error for any unknown or
invalid string values encountered in the Some arm instead of silently coercing
them to Regular.
- Around line 19-23: The Opaque deserializer implementation is tightly coupled
to serde_json::Map, which breaks compatibility with non-JSON formats like
MessagePack or CBOR. Replace the hardcoded serde_json::Map deserialization in
the deserialize method of the Opaque impl block with serde::de::IgnoredAny,
which provides format-agnostic deserialization by consuming any value from any
serialization format without enforcing a specific type structure.
In `@harmonia-file-io-pure/src/source.rs`:
- Around line 68-71: The exists() method in the Source trait currently masks all
errors (permission denied, IO failures, etc.) as false by using is_ok() on the
Result from open(). Introduce a new error-aware method try_exists that returns
Result<bool, Error> to properly distinguish between "file not found" and actual
errors, then update the existing exists() method to call try_exists() and handle
only the "not found" error case as false while propagating or logging other
errors appropriately.
In `@harmonia-file-nar/README.md`:
- Around line 3-10: The README documentation references the retired
harmonia-file-core crate and its APIs (FileSystemSource, FileSystemSink,
MemoryTreeSource, and list_deep) instead of the current harmonia-file-io-pure
crate. Replace all occurrences of harmonia-file-core with harmonia-file-io-pure
throughout the README to ensure users follow the correct, non-deprecated APIs
and avoid compilation failures when implementing the documented examples.
In `@harmonia-file-nar/src/archive/byte_stream.rs`:
- Around line 44-59: The synchronous `path.is_dir()` call in the path handling
logic is blocking the async runtime worker thread, which can cause latency
issues. Replace this blocking filesystem check with a non-blocking variant by
wrapping the synchronous metadata check in `tokio::task::block_in_place()` to
prevent blocking the Tokio runtime, or by using an async metadata retrieval
method if available from the cap-std library being used.
---
Nitpick comments:
In `@harmonia-file-nar/src/archive/mmap.rs`:
- Around line 82-84: The `len()` method implemented in the struct needs a
corresponding `is_empty()` method to satisfy the `clippy::len_without_is_empty`
lint. Add an `is_empty()` method immediately after the `len()` method that
returns a boolean indicating whether the length is zero. The `is_empty()` method
should follow the standard Rust convention of returning true when self.len
equals zero and false otherwise.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: cbd216cf-c308-49ab-88c0-a1ed14e9937e
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (34)
Cargo.tomldocs/architecture/harmonia-store-structure.mdharmonia-cache/Cargo.tomlharmonia-cache/src/error.rsharmonia-cache/src/narlist.rsharmonia-file-core/Cargo.tomlharmonia-file-core/README.mdharmonia-file-core/src/lib.rsharmonia-file-core/src/serde_impl.rsharmonia-file-fd/Cargo.tomlharmonia-file-fd/README.mdharmonia-file-fd/src/cap.rsharmonia-file-fd/src/lib.rsharmonia-file-fd/src/mmap.rsharmonia-file-fd/src/sink.rsharmonia-file-fd/src/source.rsharmonia-file-io-pure/Cargo.tomlharmonia-file-io-pure/README.mdharmonia-file-io-pure/src/canon_path.rsharmonia-file-io-pure/src/lib.rsharmonia-file-io-pure/src/listing.rsharmonia-file-io-pure/src/serde_impl.rsharmonia-file-io-pure/src/sink.rsharmonia-file-io-pure/src/source.rsharmonia-file-io-pure/src/tests.rsharmonia-file-nar/Cargo.tomlharmonia-file-nar/README.mdharmonia-file-nar/src/archive/byte_stream.rsharmonia-file-nar/src/archive/dumper.rsharmonia-file-nar/src/archive/mmap.rsharmonia-file-nar/src/archive/mod.rsharmonia-file-nar/src/archive/restorer.rsharmonia-file-nar/src/lib.rsharmonia-file-nar/src/listing.rs
💤 Files with no reviewable changes (2)
- harmonia-file-core/README.md
- harmonia-file-core/Cargo.toml
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@harmonia-cache/src/narlist.rs`:
- Line 21: The DirSource::open_path function in harmonia-file-fd/src/source.rs
uses tokio::fs::metadata() which follows symlinks, causing a root symlink
pointing to a directory to be incorrectly dereferenced and listed as the target
directory instead of as a symlink. Replace the tokio::fs::metadata() call with
tokio::fs::symlink_metadata() to avoid following symlinks and correctly identify
symlink-to-directory cases. Additionally, add a regression test case that
verifies a store path that is a root symlink to a directory is handled correctly
without being dereferenced.
In `@harmonia-file-nar/src/archive/restorer.rs`:
- Around line 17-28: The restore_to_sink function returns success immediately
after restoring the first root object without verifying that the parser has
reached EOF, which means a stream with a valid root followed by trailing events
or junk data is incorrectly accepted. After the restore_sink_event call
completes for the first root, add a check to ensure the parser has no remaining
events by calling parser.next().await and verifying it returns None, otherwise
return an appropriate error to indicate unexpected trailing data in the stream.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ae457508-d0a9-4eb8-b092-bee729ca26ad
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (34)
Cargo.tomldocs/architecture/harmonia-store-structure.mdharmonia-cache/Cargo.tomlharmonia-cache/src/error.rsharmonia-cache/src/narlist.rsharmonia-file-core/Cargo.tomlharmonia-file-core/README.mdharmonia-file-core/src/lib.rsharmonia-file-core/src/serde_impl.rsharmonia-file-fd/Cargo.tomlharmonia-file-fd/README.mdharmonia-file-fd/src/cap.rsharmonia-file-fd/src/lib.rsharmonia-file-fd/src/mmap.rsharmonia-file-fd/src/sink.rsharmonia-file-fd/src/source.rsharmonia-file-io-pure/Cargo.tomlharmonia-file-io-pure/README.mdharmonia-file-io-pure/src/canon_path.rsharmonia-file-io-pure/src/lib.rsharmonia-file-io-pure/src/listing.rsharmonia-file-io-pure/src/serde_impl.rsharmonia-file-io-pure/src/sink.rsharmonia-file-io-pure/src/source.rsharmonia-file-io-pure/src/tests.rsharmonia-file-nar/Cargo.tomlharmonia-file-nar/README.mdharmonia-file-nar/src/archive/byte_stream.rsharmonia-file-nar/src/archive/dumper.rsharmonia-file-nar/src/archive/mmap.rsharmonia-file-nar/src/archive/mod.rsharmonia-file-nar/src/archive/restorer.rsharmonia-file-nar/src/lib.rsharmonia-file-nar/src/listing.rs
💤 Files with no reviewable changes (2)
- harmonia-file-core/Cargo.toml
- harmonia-file-core/README.md
✅ Files skipped from review due to trivial changes (7)
- harmonia-file-fd/Cargo.toml
- harmonia-cache/src/error.rs
- harmonia-file-io-pure/Cargo.toml
- harmonia-file-nar/README.md
- harmonia-file-fd/README.md
- harmonia-file-io-pure/README.md
- docs/architecture/harmonia-store-structure.md
🚧 Files skipped from review as they are similar to previous changes (22)
- harmonia-file-nar/src/listing.rs
- harmonia-file-fd/src/lib.rs
- harmonia-file-nar/src/archive/mod.rs
- harmonia-cache/Cargo.toml
- harmonia-file-nar/Cargo.toml
- harmonia-file-core/src/serde_impl.rs
- harmonia-file-fd/src/cap.rs
- harmonia-file-io-pure/src/listing.rs
- harmonia-file-fd/src/mmap.rs
- harmonia-file-io-pure/src/tests.rs
- harmonia-file-nar/src/lib.rs
- harmonia-file-io-pure/src/canon_path.rs
- harmonia-file-io-pure/src/sink.rs
- harmonia-file-fd/src/source.rs
- harmonia-file-nar/src/archive/byte_stream.rs
- harmonia-file-core/src/lib.rs
- harmonia-file-io-pure/src/lib.rs
- harmonia-file-fd/src/sink.rs
- harmonia-file-nar/src/archive/dumper.rs
- harmonia-file-io-pure/src/source.rs
- harmonia-file-nar/src/archive/mmap.rs
- harmonia-file-io-pure/src/serde_impl.rs
6677036 to
00c1924
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
harmonia-file-nar/src/archive/restorer.rs (1)
121-129: 💤 Low valueConsider explicitly shutting down file sinks for test parity with production code.
Chuck Norris doesn't need to flush buffers—data commits itself out of respect. But your tests drop the file sinks after
write_all()without callingshutdown(), while production code at line 57 explicitly calls it. This relies onMemoryFileSink's Drop impl to commit data.If the Drop impl ever changes or you swap in a different sink, tests could silently lose data. Adding explicit shutdown keeps tests aligned with the production pattern.
♻️ Suggested fix
- sub.create_child("file") - .await - .unwrap() - .create_regular_file(true) - .await - .unwrap() - .write_all(b"hello") - .await - .unwrap(); + let mut file = sub.create_child("file") + .await + .unwrap() + .create_regular_file(true) + .await + .unwrap(); + file.write_all(b"hello").await.unwrap(); + file.shutdown().await.unwrap();Also applies to: 137-145
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harmonia-file-nar/src/archive/restorer.rs` around lines 121 - 129, The test code creates file sinks and writes data using `write_all()` without explicitly calling `shutdown()`, relying on the Drop impl to commit data. However, production code at line 57 explicitly calls `shutdown()`. To align tests with production behavior and ensure data consistency, add an explicit `.shutdown().await.unwrap()` call after the `write_all()` call in the file sink creation chain around lines 121-129. Apply the same fix to the similar code block at lines 137-145 to maintain test parity with production patterns and prevent silent data loss if the Drop impl changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@harmonia-file-nar/src/archive/restorer.rs`:
- Around line 121-129: The test code creates file sinks and writes data using
`write_all()` without explicitly calling `shutdown()`, relying on the Drop impl
to commit data. However, production code at line 57 explicitly calls
`shutdown()`. To align tests with production behavior and ensure data
consistency, add an explicit `.shutdown().await.unwrap()` call after the
`write_all()` call in the file sink creation chain around lines 121-129. Apply
the same fix to the similar code block at lines 137-145 to maintain test parity
with production patterns and prevent silent data loss if the Drop impl changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 3e50011b-8281-495d-b87a-bc0cbdeccd15
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (34)
Cargo.tomldocs/architecture/harmonia-store-structure.mdharmonia-cache/Cargo.tomlharmonia-cache/src/error.rsharmonia-cache/src/narlist.rsharmonia-file-core/Cargo.tomlharmonia-file-core/README.mdharmonia-file-core/src/lib.rsharmonia-file-core/src/serde_impl.rsharmonia-file-fd/Cargo.tomlharmonia-file-fd/README.mdharmonia-file-fd/src/cap.rsharmonia-file-fd/src/lib.rsharmonia-file-fd/src/mmap.rsharmonia-file-fd/src/sink.rsharmonia-file-fd/src/source.rsharmonia-file-io-pure/Cargo.tomlharmonia-file-io-pure/README.mdharmonia-file-io-pure/src/canon_path.rsharmonia-file-io-pure/src/lib.rsharmonia-file-io-pure/src/listing.rsharmonia-file-io-pure/src/serde_impl.rsharmonia-file-io-pure/src/sink.rsharmonia-file-io-pure/src/source.rsharmonia-file-io-pure/src/tests.rsharmonia-file-nar/Cargo.tomlharmonia-file-nar/README.mdharmonia-file-nar/src/archive/byte_stream.rsharmonia-file-nar/src/archive/dumper.rsharmonia-file-nar/src/archive/mmap.rsharmonia-file-nar/src/archive/mod.rsharmonia-file-nar/src/archive/restorer.rsharmonia-file-nar/src/lib.rsharmonia-file-nar/src/listing.rs
💤 Files with no reviewable changes (2)
- harmonia-file-core/README.md
- harmonia-file-core/Cargo.toml
✅ Files skipped from review due to trivial changes (4)
- harmonia-file-fd/README.md
- harmonia-file-io-pure/README.md
- harmonia-file-nar/README.md
- docs/architecture/harmonia-store-structure.md
🚧 Files skipped from review as they are similar to previous changes (27)
- harmonia-cache/src/error.rs
- harmonia-file-nar/Cargo.toml
- harmonia-file-fd/Cargo.toml
- harmonia-file-nar/src/listing.rs
- harmonia-file-fd/src/mmap.rs
- harmonia-file-core/src/serde_impl.rs
- harmonia-file-io-pure/src/lib.rs
- Cargo.toml
- harmonia-cache/Cargo.toml
- harmonia-file-nar/src/lib.rs
- harmonia-file-io-pure/src/tests.rs
- harmonia-file-nar/src/archive/mmap.rs
- harmonia-file-fd/src/sink.rs
- harmonia-file-io-pure/Cargo.toml
- harmonia-file-io-pure/src/listing.rs
- harmonia-file-fd/src/lib.rs
- harmonia-file-fd/src/cap.rs
- harmonia-file-core/src/lib.rs
- harmonia-file-io-pure/src/serde_impl.rs
- harmonia-file-nar/src/archive/byte_stream.rs
- harmonia-cache/src/narlist.rs
- harmonia-file-io-pure/src/canon_path.rs
- harmonia-file-nar/src/archive/dumper.rs
- harmonia-file-io-pure/src/source.rs
- harmonia-file-fd/src/source.rs
- harmonia-file-io-pure/src/sink.rs
- harmonia-file-nar/src/archive/mod.rs
524841e to
13a1253
Compare
Add harmonia-file-io-pure and harmonia-file-fd implementing a generic async file-tree abstraction mirroring nix's SourceAccessor and FileSystemObjectSink, slim harmonia-file-core to the shared data types, and route NAR dump/restore plus harmonia-cache/narlist through the new FileSystemSource/FileSystemSink traits. The filesystem source wraps the released cap-std in spawn_blocking rather than depending on the unreleased cap-tokio PR, which is only that same sync crate behind spawn_blocking anyway. read_link uses read_link_contents so absolute symlink targets come back verbatim, and the dump strips the case-hack suffix on case-insensitive filesystems to match nix. Co-authored-by: Amaan Qureshi <git@amaanq.com>
Add two new crates and rework two existing ones to implement a generic
async file tree abstraction mirroring nix's
SourceAccessor/FileSystemObjectSinkarchitecture (NixOS/nix#15392):harmonia-file-io-pure(new): Async IO traits (FileSystemSource,FileSystemSink,DirectorySink,RegularFileSink), in-memoryimplementation (
MemoryTreeSource,MemoryTreeBuilder), and listingfunctions (
list_deep,list_shallow). Extracted fromharmonia-file-core— does not do any "real" IO, but usesasyncandcreates opinionated IO abstractions.
harmonia-file-fd(new): FilesystemFileSystemSource(DirSource)and
FileSystemSink(DirSlotSink) viacap-tokio. Usesopenat/fstatat, mmap for large files. Lazy child thunks — directoryhandles opened on demand.
harmonia-file-core(existing): Slimmed down to just the genericdata types (
FileTree,FileSystemObject) and serde. IO traits,listing functions, and in-memory impls moved to
harmonia-file-io-pure.harmonia-file-nar(existing): NAR dump/restore now goes throughFileSystemSource/FileSystemSinktraits viadump_sourceandrestore_to_sink.parse_nar_listingproducesFileTree<NarFileInfo>from NAR streams.
Other changes:
harmonia-cache/narlist.rsusesDirSource+list_deep(async,no
spawn_blocking)Usesthe parts we need of that are inlined.cap-tokio(Bring backcap-async-stdascap-tokiobytecodealliance/cap-std#414) for asynccapability-based filesystem access